SEQUENCE LISTING






(1) GENERAL INFORMATION: 

(i) APPLICANT: National Starch and Chemical Investment Holding Corporation

(ii) TITLE OF INVENTION: Improvements in or Relating to Plant Starch Composition

(iii) NUMBER OF SEQUENCES: 20 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: National Starch and Chemical Investment Holding Corporation

(B) STREET: 1000 Uniqema Blvd. 

(C) CITY: Newcastle 

(D) STATE: Delaware 

(E) COUNTRY: United States of America 

(F) ZIP: 19720

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 57 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

AAGGATCCGT CGACATCGAT AATACGACTC ACTATAGGGA TTTTTTTTTT TTTTTTT 
57 

(2) INFORMATION FOR SEQ ID NO: 2: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

AAGGATCCGT CGACATC 
17 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GACATCGATA ATACGAC 
17 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:



CATCCAACCA CCATCTCGCA 
20 
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

TTGAGAGAAG ATACCTAAGT 
20 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

ATGTTCAGTC CATCTAAAGT 
20 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

AGAACAACAA TTCCTAGCTC
20 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GGGGCCTTGA ACTCAGCAAT 
20 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CGTCCCAGCA TTCGACATAA 
20 

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






1627D. txt 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CTTGGATCCT TGAACTCAGC AATTTG 
26 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

TAACTCGAGC AACGCGATCA CAAGTTCGT 
29 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3003 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

GATGGGGCCT TGAACTCAGC AATTTGACAC TCAGTTAGTT ACACTGCCAT CACTTATCAG 
60 

ATCTCTATTT TTTCTCTTAA TTCCAACCAA GGAATGAATA AAAAGATAGA TTTGTAAAAA 
120 

CCCTAAGGAG AGAAGAAGAA AGATGGTGTA TACACTCTCT GGAGTTCGTT TTCCTACTGT 
180 

TCCATCAGTG TACAAATCTA ATGGATTCAG CAGTAATGGT GATCGGAGGA ATGCTAATAT 
240

TTCTGTATTC TTGAAAAAAC ACTCTCTTTC ACGGAAGATC TTGGCTGAAA AGTCTTCTTA
300

CAATTCCGAA TCCCGACCTT CTACAATTGC AGCATCGGGG AAAGTCCTTG TGCCTGGAAT
360

CCAGAGTGAT AGCTCCTCAT CCTCAACAGA TCAATTTGAG TTCGCTGAGA CATCTCCAGA
420

AAATTCCCCA GCATCAACTG ATGTAGATAG TTCAACAATG GAACACGCTA GCCAGATTAA
480

AACTGAGAAC GATGACGTTG AGCCGTCAAG TGATCTTACA GGAAGTGTTG AAGAGCTGGA
540

TTTTGCTTCA TCACTACAAC TACAAGAAGG TGGTAAACTG GAGGAGTCTA AAACATTAAA
600

TACTTCTGAA GAGACAATTA TTGATGAATC TGATAGGATC AGAGAGAGGG GCATCCCTCC
660

ACCTGGACTT GGTCAGAAGA TTTATGAAAT AGACCCCCTT TTGACAAACT ATCGTCAACA
720

CCTTGATTAC AGGTATTCAC AGTACAAGAA ACTGAGGGAG GCAATTGACA AGTATGAGGG
780

TGGTTTGGAA GCTTTTTCTC GTGGTTATGA AAGAATGGGT TTCACTCGTA GTGCTACAGG
840

TATCACTTAC CGTGAGTGGG CTCCTGGTGC CCAGTCAGCT GCCCTCATTG GGGATTTCAA
900

CAATTGGGAC GCAAATGCTG ACTTTATGAC TCGGAATGAA TTTGGTGTCT GAGAGATTTT
960

TCTGCCAAAT AATGTGGATG GTTCTCCTGC AATTCCTCAT GGGTCCAGAG TGAAGATACG
1020

TATGGACACT CCATCAGGTG TTAAGGATTC CATTCCTGCT TGGATCAACT ACTCTTTACA
1080

GCTTCCTGAT GAAATTCCAT ATAATGGAAT ATATTATGAT CCACCCGAAG AGGAGAGGTA
1140

TATCTTCCAA CACCCACGGC CAAAGAAACC AAAGTCGGTG AGAATATATG AATCTCATAT
1200

TGGAATGAGT AGTCCGGAGC CTAAAATTAA CTCATACGTG AATTTTAGAG ATGAAGTTCT
1260

TCCTCGCATA AAAAAAGCTT GGGTACAATG CGGTGCAAAT TATGGCTATT CAAGAGCATT
1320

CTTATTATGC TAGTTTTGGT TATCATGTCA CAAATTTTTT TGCACCAAGC AGCCGTTTTG
1380

GAACGCCCGA CGACCTTAAG TCTTTGATTG ATAAAGCTCA TGAGCTAGGA ATTGTTGTTC
1440

TCATGGACAT TGTTCACAGC CATGCATCAA ATAATACTTT AGATGGACTG AACATGTTTG
1500

ACGGCACAGA TAGTTGTTAC TTTCACTCTG GAGCTCGTGG TTATCATTGG ATGTGGGATT
1560

TCCGCCTCTT TAACTATGGA AACTGGGAGG TACTTAGGTA TCTTCTCTCA AATGCGAGAT
1620

GGTGGTTGGA TGAGTTCAAA TTTGATGGAT TTAGATTTGA TGGTGTGACA TCAATGATGT
1680

GTACTCACCA CGGATTATCG GTGGGATTCA CTGGGAACTA CGAGGAATAC TTTGGACTCG
1740

CAACTGATGT GGATGCTGTT GTGTATCTGA TGCTGGTCAA CGATCTTATT CATGGGCTTT
1800

TCCCAGATGC AATTACCATT GGTGAAGATG TTAGCGGAAT GCCGACATTT TGTGTTCCCG
1860

TTCAAGATGG GGGTGTTGGC TTTGACTATC GGCTGCATAT GGCAATTGCT GATAAATGGA
1920

TTGAGTTGCT CAAGAAACGG GATGAGGATT GGAGAGTGGG TGATATTGTT CATACACTGA
1980

CAAATAGAAG ATGGTCGGAA AAGTGTGTTT CATACGCTGA AAGTCATGAT CAAGCTCTAG
2040

TCGGTGATAA AACTATAGCA TTCTGGCTGA TGGACAAGGA TATGTATGAT TTTATGGCTC
2100

TGGATAGACC GTCAACATCA TTAATAGATC GTGGGATAGC ATTACACAAG ATGATTAGGC
2160

TTGTAACTAT GGGATTAGGA GGAGAAGGGT ACCTAAATTT CATGGGAAAT GAATTCGGCC 
2220 

ACCCTGAGTG GATTGATTTC CCTAGGGCTG AACAACACCT CTCTGATGGC TCAGTAATTC 
2280 

CCAGAAACCA ATTCAGTTAT GATAAATGCA GACGGAGATT TGACCTGGGA GATGCAGAAT 
2340 

ATTTAAGATA CCGTGGGTTG CAAGAATTTG ACCGGGCTAT GCAGTATCTT GAAGATAAAT 
2400 

ATGAGTTTAT GACTTCAGAA CACCAGTTCA TATCACGAAA GGATGAAGGA GATAGGATGA 
2460 

TTGTATTTGA AAAAGGAAAC CTAGTTTTTG TCTTTAATTT TCACTGGACA AAAGGCTATT 
2520 

CAGACTATCG CATAGGCTGC CTGAAGCCTG GAAAATACAA GGTTGCCTTG GACTCAGATG 
2580 

ATCCACTTTT TGGTGGCTTC GGGAGAATTG ATCATAATGC CGAATATTTC ACCTTTGAAG 
2640 

GATGGTATGA TGATCGTCCT CGTTCAATTA TGGTGTATGC ACCTAGTAGA ACAGCAGTGG 
2700 

TCTATGCACT AGTAGACAAA GAAGAAGAAG AAGAAGAAGA AGTAGCAGTA GTAGAAGAAG 
2760 

TAGTAGTAGA AGAAGAATGA ACGAACTTGT GATCGCGTTG AAAGATTTGA ACGCCACATA 
2820 

GAGCTTCTTG ACGTATCTGG CAATATTGCA TTAGTCTTGG CGGAATTTCA TGTGACAACA 
2880 

GGTTTGCAAT TCTTTCCACT ATTAGTAGTG CAACGATATA CGCAGAGATG AAGTGCTGAA 
2940 

CAAAAACATA TGTAAAATCG ATGAATTTAT GTCGAATGCT GGGACGATCG AATTCCTGCA 
3000 

GCC 

3003 

(2) INFORMATION FOR SEQ ID NO: 13: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2975 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

TTGATGGGCC TTGAACTCAG CAATTTGACA CTCAGTTAGT TACACTCCTA TCACTTATCA 
60 

GATCTCTATT TTTTCTCTTA ATTCCAACCA GGGGAATGAA TAAAAGGATA GATTTGTAAA 
120 

AACCCTAAGG AGAGAAGAAG AAAGATGGTG TATATACTCT CTGGAGTTCG TTTTCCTACT 
180 

GTTCCATCAG TGTACAAATC TAATGGATTC AGCAGTAATG GTGATCGGAG GAATGCTAAT 
240 

GTTTCTGTAT TCTTGAAAAA GCACTCTCTT TCACGGAAGA TCTTGGCTGA AAAGTCTTCT 
300 

TACAATTCCG AATTCCGACC TTCTACAGTT GCAGCATCGG GGAAAGTCCT TGTGCCTGGA 
360 

ACCCAGAGTG ATAGCTCCTC ATCCTCAACA GACCAATTTG AGTTCACTGA GACATCTCCA 
420 

GAAAATTCCC CAGCATCAAC TGATGTAGAT AGTTCAACAA TGGAACACGC TAGCCAGATT 
480 

AAAACTGAGA ACGATGACGT TGAGCCGTCA AGTGATCTTA CAGGAAGTGT TGAAGAGCTG 
540 

GATTTTGCTT CATCACTACA ACTACAAGAA GGTGGTAAAC TGGAGGAGTC TAAAACATTA 
600 

AATACTTCTG AAGAGACAAT TATTGATGAA TCTGATAGGA TCAGAGAGAG GGGCATCCCT 
660 

CCACCTGGAC TTGGTCAGAA GATTTATGAA ATAGACCCCC TTTTGACAAA CTATCGTCAA 
720 
CACCTTGATT ACAGGTATTC ACAGTACAAG AAACTGAGGG AGGCAATTGA CAAGTATGAG
780

GGTGGTTTGG AAGCTTTTCT CGTGGTTATG AAAAAATGGG TTTCACTCGT AGTGCTACAG
840

GTATCACTTA CCGTGAGTGG GCTCCTGGTG CCCAGTCAGC TGCCCTCATT GGAGATTTCA
900

ACAATTGGGA CGCAAATGCT GACATTATGA CTCGGAATGA ATTTGGTGTC TGGGAGATTT
960

TTCTGCCAAA TAATGTGGAT GGTTCTCCTG CAATTCCTCA TGGGTCCAGA GTGAAGATAC
1020

GTATGGACAC TCCATCAGGT GTTAAGGATT CCATTCCTGC TTGGATCAAC TACTCTTTAC
1080

AGCTTCCTGA TGAAATTCCA TATAATGGAA TATATTATGA TCCACCCGAA GAGGAGAGGT
1140

ATATCTTCCA ACACCCACGG CCAAAGAAAC CAAAGTCGCT GAGAATATAT GAATCTCATA
1200

TTGGAATGAG TAGTCCGGAG CCTAAAATTA ACTCATACGT GAATTTTAGA GATGAAGTTC
1260

TTCCTCGCAT AAAAAAGCTT GGGTACAATG CGCTGCGAAT TATGGCTATT CAAGAGCATT
1320

CTTATTATGC TAGTTTTGGT TATCATGTCA CAAATTTTTT TGCACCAAGC AGCCGTTTTG
1380

GAACGCCCGA CGACCTTAAG TCTTCGATTG ATAAAGCTCA TGAGCTAGGA ATTGTTGTTC
1440

TCATGGACAT CGTTCACAGC CATGCATCAA ATAATACTTT AGATGGACTG AACATGTTTG
1500

ACGGCACCGA TAGTTGTTAC TTTCACTCTG GAGCTCGTGG TTATCATTGG ATGTGGGATT
1560

CCGCCTCTTT AACTATGGAA ACTGGGAGGT ACTTAGGTAT CTTCTCTCAA ATGCGAGATG
1620

GTGGTTGGAT GAGTTCAAAT TTGATGGATT TAGATTCGAT GGTGTGACAT CAATGATGTA
1680

TACTCACCAC GGATTATCGG TGGGATTCAC TGGGAACTAC GAGGAATACT TTGGACTCGC
1740

AACTGATGTG GATGCTGTTG TGTATCTGAT GCTGGTCAAC GATCTTATTC ATAGGCTTTT
1800

CCCAGATGCA ATTACCATTG GTGAAGATGT TAGCGGAATG CCGACATTTT GTATTCCCGT
1860

TCAAGATGGG GGTGTTGGCT TTGACTATCG GCTGCATATG GCAATTGCTG ATAAATGGAT
1920

TGAGTTGCTC AAGAAACGGG ATGAGGATTG GAGAGTGGGT GATATTGTTC ATACACTGAC
1980

AAATAGAAGA TGGTCGGAAA AGTGTGTTTC ATACGCTGAA AGTCATGATC AAGCTCTAGT
2040

CGGTGATAAA ACTATAGCAT TCTGGCTGAT GGACAAGGAT ATGTATGATT TTATGGCTCT
2100

GGATAGACCG CCAACATCAT TAATAGATCG TGGGATAGCA TTGCACAAGA TGATTAGGCT
2160

TGTAACTATG GGATTAGGAG GAGAAGGGTA CCTAAATTTC ATGGGAAATG AATTCGGCCA
2220

CCCTGAGTGG ATTGATTTCC CTAGGGCTGA GCCACACCTT TCTGATGGCT CAGTAATTCC
2280

CGGAAACCAA TTCAGTTATG ATAAATGCAG ACGGAGATTT GACCTGGGAG ATGCAGAATA
2340

TTTAAGATAC CATGGGTTAC AAGAATTTGA CTGGGCTATG CAGTATCTTG AAGATAAATA
2400

TGAGTTTATG ACTTCAGAAC ACCAGTTCAT ATCACGAAAG GATGAAGGAG ATAGGATGAT
2460

TGTATTTGAA AGAGGAAACC TAGTTTTCGT CTTTAATTTT CACTGGACAA ATAGCTATTC
2520

AGACTATCGC ATAGGCTGCC TGAAGCCTGG AAAATACAAG GTTGTCTTGG ACTCAGATGA
2580

TCCACTTTTT GGTGGCTTCG GGAGAATTGA TCATAATGCC GAATATTTCA CCTCTGAAGG
2640



ATCGTATGAT GATCGTCCTT GTTCAATTAT GGTGTATGCA CCTAGTAGAA CAGCAGTGGT 
2700 

CTATGCACTA GTAGACAAAC TAGAAGTAGC AGTAGTAGAA GAACCCATTG AAGAATGAAC 
2760 

GAACTTGTGA TCGCGTTGAA AGATTTGAAC GTTACTTGGT CATCCACATA GAGCTTCTTG 
2820 

ACATCAGTCT TGGCGGAATT GCATGTGACA ACAAGGTTTG CAGTTCTTTC CACTATTAGT 
2880 

AGTCCACCGA TATACGCAGA GATGAAGTGC TGAACAAACA TATGTAAAAT CGATGAATTT 
2940 

ATGTCGAATG CTGGGACGAT CGAATTCCTG CAGCC 
2975 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3033 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 145..2790



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

TTGATGGGGC CTTGAACTCA GCAATTTGAC ACTCAGTTAG TTACACTCCT ATCACTTATC 
60 

AGATCTCTAT TTTTTCTCTT AATTCCAACC AAGGAATGAA TAAAAGGATA GATTTGTAAA 
120 

AACCCTAAGG AGAGAAGAAG AAAG ATG GTG TAT ACA CTC TCT GGA GTT CGT 
171 

Met Val Tyr Thr Leu Ser Gly Val Arg 
1 5 

TTT CCT ACT GTT CCA TCA GTG TAC AAA TCT AAT GGA TTC AGC AGT AAT 
219
Phe Pro Thr 
10 

GGT GAT CGG 
267 

Gly Asp Arg 



CTT TCA CGG 
315 

Leu Ser Arg 



CGA CCT TCT 
363 

Arg Pro Ser 
60 

CAG AGT GAT 
411 

Gln Ser Asp
75 

ACA TCT CCA 
459 

Thr Ser Pro 

90 

ATG GAA CAC 
507 

Met Glu His 



TCA AGT GAT 
555 

Ser Ser Asp 



CTA CAA CTA 
603 

Leu Gln Leu
140 

ACT TCT GAA 
651 

Thr Ser Glu 
155 



Val Pro 



Ser Val 
15 
Tyr Lys Ser



Asn Gly Phe 

20 



Ser Ser Asn 

25 



AGG AAT GCT AAT 

Arg Asn Ala Asn 
30 

AAG ATC TTG GCT 
Leu Ala 



GTT TCT GTA TTC TTG AAA AAG CAC TCT 
Phe Leu Lys 



Val Ser Val 
35 



Lys His Ser 
40 



Lys Ile
45 

ACA GTT 
Thr Val 

AGC TCC 
Ser Ser 

GAA AAT 
Glu Asn 



GAA AAG TCT TCT TAC AAT TCC GAA TTC 
Ser Tyr Asn 



Glu Lys Ser 
50 



Ser Glu Phe 
55 



GCA GCA 
Ala Ala 



TCG GGG AAA GTC CTT GTG CCT GGA ACC 

Pro Gly Thr 



Ser Gly Lys 
65 



Val Leu Val 
70 



TCA TCC TCA ACA GAC 

Ser Ser Ser Thr Asp 
80 



CAA TTT GAG TTC ACT GAG 

Gln Phe Glu Phe Thr Glu
85 



TCC CCA GCA TCA ACT GAT GTA GAT AGT TCA ACA 
Ala Ser Thr 



Ser Pro 
95 



Asp Val Asp 
100 



Ser Ser Thr 
105 



GCT AGC CAG ATT 

Ala Ser Gln Ile
110 

CTT ACA GGA AGT 
Gly Ser 



AAA ACT GAG AAC GAT GAC GTT GAG CCG 
Asn Asp Asp 



Lys Thr Glu 
115 



Val Glu Pro 
120 



Leu Thr 

125 

CAA GAA 
Gln Glu

GAG ACA 
Glu Thr 



GTT GAA GAG CTG GAT TTT GCT TCA TCA 
Leu Asp Phe 



Val Glu Glu 

130 



Ala Ser Ser 
135 



GGT GGT 
Gly Gly 



AAA CTG GAG GAG TCT AAA ACA TTA AAT 

Thr Leu Asn 



Lys Leu Glu 
145 



ATT ATT GAT GAA TCT 
Asp Glu Ser 
Ile Ile
160 



Glu Ser Lys 
150 



GAT AGG ATC AGA GAG AGG 
Arg Glu Arg 



Asp Arg Ile
165 



GGC ATC CCT 

699 
Gly Ile Pro
170 

CTT TTG ACA 
747 

Leu Leu Thr 



AAG AAA CTG 

795 
Lys Lys Leu 



TTT TCT CGT 

843 
Phe Ser Arg 
220 

ATC ACT TAC 
891 

Ile Thr Tyr

235 

GGA GAT TTC 

939 

Gly Asp Phe 
250 

GAA TTT GGT 
987 

Glu Phe Gly 



CCT GCA ATT 

1035 
Pro Ala Ile



TCA GGT GTT 

1083 
Ser Gly Val 
300 

CTT CCT GAT 
1131 



CCA CCT 
Pro Pro 
GGA CTT GGT CAG AAG ATT



Gly Leu 
175 



Gly Gln Lys Ile
180 



TAT GAA ATA GAC CCC 

Tyr Glu Ile Asp Pro
185



AAC TAT CGT CAA 

Asn Tyr Arg Gln
190 

AGG GAG GCA ATT 
Ala Ile



CAC CTT GAT TAC AGG TAT TCA CAG TAC 
Arg Tyr 



His Leu Asp Tyr 
195 



Ser Gln Tyr
200 



Arg Glu 

205 

GGT TAT 
Gly Tyr 

CGT GAG 
Arg Glu 

AAC AAT 
Asn Asn 



GAC AAG TAT GAG GGT GGT TTG GAA GCC 
Gly Gly 



Asp Lys Tyr Glu 
210 



Leu Glu Ala 
215 



GAA AAA 
Glu Lys 



ATG GGT TTC ACT CGT AGT GCT ACA GGT 

Ala Thr Gly 



Met Gly Phe Thr 
225 



Arg Ser 
230 



TGG GCT CTT GGT GCC CAG 

Trp Ala Leu Gly Ala Gln
240 



TCA GCT GCC CTC ATT 

Ser Ala Ala Leu Ile
245 



TGG GAC GCA AAT GCT GAC ATT ATG ACT CGG AAT 

Ile Met



Trp Asp 

255 



Ala Asn Ala Asp 
260 



Thr Arg Asn 
265 



GTC TGG GAG ATT 

Val Trp Glu Ile
270 

CCT CAT GGG TCC 
Gly Ser 



TTT CTG CCA AAT AAT GTG GAT GGT TCT 
Asn Val 



Phe Leu Pro Asn 
275 



Asp Gly Ser 
280 



Pro His 
285 

AAG GAT 
Lys Asp 



AGA GTG AAG ATA CGT ATG GAC ACT CCA 
Arg Met 



Arg Val Lys Ile

290 



Asp Thr Pro 
295 



TCC ATT 
Ser Ile



CCT GCT TGG ATC AAC TAC TCT TTA CAG 

Ser Leu Gln



Pro Ala Trp Ile
305 



Asn Tyr 
310 



GAA ATT CCA TAT AAT GGA ATA CAT TAT GAT CCA CCC GAA 
Leu Pro Asp
315 

GAG GAG AGG 

1179 
Glu Glu Arg 
330 

CTG AGA ATA 

1227 
Leu Arg Ile



ATT AAC TCA 

1275 
Ile Asn Ser



Glu Ile

TAT ATC 
Tyr Ile



Pro Tyr 
320 
Asn Gly Ile



His Tyr Asp 
325 



Pro Pro Glu 



TTC CAA CAC CCA CGG CCA AAG AAA CCA AAG TCG 
His Pro Arg 



Phe Gln
335 



Pro Lys Lys 
340



Pro Lys Ser 
345 



TAT GAA TCT CAT 

Tyr Glu Ser His 
350 

TAC GTG AAT TTT 
Asn Phe 



ATT GGA ATG AGT AGT CCG GAG CCT AAA 
Ser Ser Pro 



Ile Gly Met
355 



Glu Pro Lys 
360 



AAG 
Lys 

TAT 
Tyr 

AGC 

Ser 
410 

CAT 

His 



CTT GGG 
1323 
Leu Gly 
380 

TAC GCT 
1371 
Tyr Ala 
395 

CGT TTT 
1419 
Arg Phe 



GAG CTA 
1467 
Glu Leu 



Tyr Val 
365 

TAC AAT 
Tyr Asn 

AGT TTT 
Ser Phe 

GGA ACG 
Gly Thr 



AGA GAT GAA GTT CTT CCT CGC ATA AAA 
Val Leu Pro 



Arg Asp Glu 
370 



Arg Ile Lys
375 



GCG CTG 
Ala Leu 



CAA ATT ATG GCT ATT CAA GAG CAT TCT 

Glu His Ser 



Gln Ile Met
385 



GGT TAT CAT GTC ACA 

Gly Tyr His Val Thr 
400 



Ala Ile Gln

390 



AAT TTT TTT GCA CCA AGC 

Asn Phe Phe Ala Pro Ser 
405 



CCC GAC GAC CTT AAG TCT TTG ATT GAT AAA GCT 
Asp Leu Lys 



Pro Asp 
415 



Ser Leu Ile
420 



Asp Lys Ala 
425 



GGA ATT GTT GTT CTC ATG GAC 
Val Val 



Gly Ile
430 



Leu Met Asp 
435 



ATT GTT CAC AGC CAT GCA 

Ile Val His Ser His Ala
440 



TCA AAT AAT 

1515 
Ser Asn Asn 



TGT TAC TTT 

1563 
Cys Tyr Phe 
460



ACT TTA GAT GGA 

Thr Leu Asp Gly 
445 



CTG AAC ATG TTT GAC TGC ACC GAT AGT 
Phe Asp Cys 



Leu Asn Met 
450 



Thr Asp Ser 
455 



CAC TCT 
His Ser 



GGA GCT 
Gly Ala 



CGT GGT TAT CAT TGG ATG TGG GAT TCC 

Trp Asp Ser 



Arg Gly Tyr 
465 



His Trp Met 
470 
CGC CTC TTT AAC TAT GGA AAC TGG GAG GTA CTT AGG TAT CTT CTC TCA
1611 

Arg Leu Phe Asn Tyr Gly Asn Trp Glu Val Leu Arg Tyr Leu Leu Ser 
475 480 485 

AAT GCG AGA TGG TGG TTG GAT GCG TTC AAA TTT GAT GGA TTT AGA TTT 
1659 

Asn Ala Arg Trp Trp Leu Asp Ala Phe Lys Phe Asp Gly Phe Arg Phe 

490 495 500 505 

GAT GGT GTG ACA TCA ATG ATG TAT ATT CAC CAC GGA TTA TCG GTG GGA 
1707 

Asp Gly Val Thr Ser Met Met Tyr Ile His His Gly Leu Ser Val Gly
510 515 520 

TTC ACT GGG AAC TAC GAG GAA TAC TTT GGA CTC GCA ACT GAT GTG GAT 
1755 

Phe Thr Gly Asn Tyr Glu Glu Tyr Phe Gly Leu Ala Thr Asp Val Asp 

525 530 535 

GCT GTT GTG TAT CTG ATG CTG GTC AAC GAT CTT ATT CAT GGG CTT TTC 
1803 

Ala Val Val Tyr Leu Met Leu Val Asn Asp Leu Ile His Gly Leu Phe
540 545 550

CCA GAT GCA ATT ACC ATT GGT GAA GAT GTT AGC GGA ATG CCG ACA TTT 
1851 

Pro Asp Ala Ile Thr Ile Gly Glu Asp Val Ser Gly Met Pro Thr Phe

555 560 565 

TGT ATT CCC GTC CAA GAG GGG GGT GTT GGC TTT GAC TAT CGG CTG CAT 
1899 

Cys Ile Pro Val Gln Glu Gly Gly Val Gly Phe Asp Tyr Arg Leu His
570 575 580 585 

ATG GCA ATT GCT GAT AAA CGG ATT GAG TTG CTC AAG AAA CGG GAT GAG 
1947 

Met Ala Ile Ala Asp Lys Arg Ile Glu Leu Leu Lys Lys Arg Asp Glu
590 595 600 

GAT TGG AGA GTG GGT GAT ATT GTT CAT ACA CTG ACA AAT AGA AGA TGG 
1995 

Asp Trp Arg Val Gly Asp Ile Val His Thr Leu Thr Asn Arg Arg Trp
605 610 615 

TCG GAA AAG TGT GTT TCA TAC GCT GAA AGT CAT GAT CAA GCT CTA GTC 
2043 

Ser Glu Lys Cys Val Ser Tyr Ala Glu Ser His Asp Gln Ala Leu Val
620

GGT GAT AAA 

2091 
Gly Asp Lys 

635 

TTT ATG GCT 

2139 
Phe Met Ala 
650 

GCA TTG CAC 

2187 
Ala Leu His 



GGG TAC CTA 

2235 
Gly Tyr Leu 



ACT ATA 
Thr Ile

CTG GAT 
Leu Asp 



625 

GCA TTC TGG CTG ATG 

Ala Phe Trp Leu Met 

640 

AGA CCG TCA ACA TCA 
Ser Thr Ser 



630 

GAC AAG GAT ATG TAT GAT 

Asp Lys Asp Met Tyr Asp 
645

TTA ATA GAT CGT GGG ATA 



Arg Pro 
655 



Leu Ile Asp Arg
660 



Gly Ile
665 



AAG ATG ATT AGG 

Lys Met Ile Arg

670

AAT TTC ATG GGA 

Asn Phe Met Gly 
685 



CTT GTA ACT ATG GGA TTA GGA GGA GAA 
Met Gly Leu Gly 



GAT 
Asp 

GGA 
Gly 

GAT 

Asp 
730 

ATG 

Met 



TTC CCT 
2283 
Phe Pro 
700 

AAC CAA 
2331 
Asn Gln
715 

GCA GAA 
2379 
Ala Glu 



CAG TAT 
2427 
Gln Tyr



AGG GCT 
Arg Ala 

TTC AGT 
Phe Ser 

TAT TTA 
Tyr Leu 



GAA CAA 
Glu Gln



Leu Val Thr 

675 



AAT GAA TTC GGC CAC 

Asn Glu Phe Gly His 
690 

CAC CTC TCT GAT GGC 
Asp Gly 



Gly Glu 
680 



His Leu Ser 
705 



CCT GAG TGG ATT 

Pro Glu Trp Ile
695 

TCA GTA ATC CCC 

Ser Val Ile Pro
710 



TAT GAT AAA TGC AGA 

Tyr Asp Lys Cys Arg 
720 

AGA TAC CGT GGG TTG 
Arg Gly Leu 



CGG AGA TTT GAC CTG GGA 

Arg Arg Phe Asp Leu Gly 
725 



Arg Tyr 
735 



CAA GAA TTT GAC CGG CCT 
Phe Asp 



Gln Glu
740 



Arg Pro 
745 



TTC ATA TCA 

2475 
Phe Ile Ser



CTT GAA GAT AAA 

Leu Glu Asp Lys 
750 

CGA AAG GAT GAA 
Asp Glu 



TAT GAG TTT ATG ACT 

Tyr Glu Phe Met Thr 
755



TCA GAA CAC CAG 

Ser Glu His Gln
760 



Arg Lys 
765



GGA GAT AGG ATG ATT GTA TTT GAA AAA 

Glu Lys 



GGA AAC CTA GTT TTT GTC TTT 



Gly Asp Arg 
770 

AAT TTT CAC 
Met Ile Val Phe

775 



TGG ACA AAA AGC TAT TCA 




2523 

Gly Asn Leu Val Phe Val Phe Asn Phe His Trp Thr Lys Ser Tyr Ser 
780 785 790 

GAC TAT CGC ATA GCC TGC CTG AAG CCT GGA AAA TAC AAG GTT GCC TTG 
2571 

Asp Tyr Arg Ile Ala Cys Leu Lys Pro Gly Lys Tyr Lys Val Ala Leu
795 800 805 

GAC TCA GAT GAT CCA CTT TTT GGT GGC TTC GGG AGA ATT GAT CAT AAT 
2619 

Asp Ser Asp Asp Pro Leu Phe Gly Gly Phe Gly Arg Ile Asp His Asn

810 815 820 825

GCC GAA TAT TTC ACC TTT GAA GGA TGG TAT GAT GAT CGT CCT CGT TCA 
2667 

Ala Glu Tyr Phe Thr Phe Glu Gly Trp Tyr Asp Asp Arg Pro Arg Ser 
830 835 840 

ATT ATG GTG TAT GCA CCT TGT AAA ACA GCA GTG GTC TAT GCA CTA GTA 
2715 

Ile Met Val Tyr Ala Pro Cys Lys Thr Ala Val Val Tyr Ala Leu Val
845 850 855 

GAC AAA GAA GAA GAA GAA GAA GAA GAA GAA GAA GAA GAA GTA GCA GCA 
2763 

Asp Lys Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Glu Val Ala Ala 
860 865 870 

GTA GAA GAA GTA GTA GTA GAA GAA GAA TGAACGAACT TGTGATCGCG 
2810 

Val Glu Glu Val Val Val Glu Glu Glu 

875 880 

TTGAAAGATT TGAACGCTAC ATAGAGCTTC TTGACGTATC TGGCAATATT GCATCAGTCT 
2870 

TGGCGGAATT TCATGTGACA CAAGGTTTGC AATTCTTTCC ACTATTAGTA GTGCAACGAT 
2930 

ATACGCAGAG ATGAAGTGCT GAACAAACAT ATGTAAAATC GATGAATTTA TGTCGAATGC 
2990 

TGGGACGATC GAATTCCTGC AGGCCGGGGG ACCCCTTAGT TCT 
3033 



(2) INFORMATION FOR SEQ ID NO: 15: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 882 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 



Met Val 
1 

Tyr Lys 
Val Ser 



Glu Lys 
50 

Ser Gly 
65 

Ser Thr 



Ala Ser 

Lys Thr 

Val Glu 
130 

Lys Leu 
145 

Asp Glu 
Gly Gln
His Leu 



Tyr Thr Leu 

5 

Ser Asn Gly 
20 

Val Phe Leu 
35 

Ser Ser Tyr 

Lys Val Leu 

Asp Gln Phe
85 

Thr Asp Val 
100 

Glu Asn Asp 
115 

Glu Leu Asp 
Glu Glu Ser 



Ser Asp Arg 
165 

Lys Ile Tyr
180 

Asp Tyr Arg 
195 



Ser Gly Val 



Phe Ser Ser 



Arg Phe Pro Thr 
10 

Asn Gly Asp Arg 
25 



Lys Lys His Ser Leu Ser Arg 
40



Asn Ser 
55 

Val Pro 
70 



Glu 



Gly 



Glu Phe Thr 



Asp Ser 
Asp Val 



Phe Ala 
135 

Lys Thr 
150 



Ser 



Glu 
120 

Ser 



Leu 



Ile Arg Glu



Glu Ile Asp



Phe Arg Pro Ser 
60 

Thr Gln Ser Asp
75 

Glu Thr Ser Pro 
90 

Thr Met Glu His 
105 

Pro Ser Ser Asp 



Ser Leu Gln Leu
140 

Asn Thr Ser Glu 
155 

Arg Gly Ile Pro
170 

Pro Leu Leu Thr 
185 



Tyr Ser Gln Tyr Lys Lys Leu
200



Val Pro Ser Val 
15 

Arg Asn Ala Asn 
30 

Lys Ile Leu Ala
45 

Thr Val Ala Ala 



Ser Ser Ser Ser 



Glu Asn Ser Pro 
95 

Ala Ser Gln Ile
110 

Leu Thr Gly Ser 
125 

Gln Glu Gly Gly



Glu Thr Ile Ile
160 

Pro Pro Gly Leu 
175 

Asn Tyr Arg Gln
190 

Arg Glu Ala Ile

205 
Asp Lys Tyr Glu Gly Gly Leu Glu Ala Phe Ser Arg Gly Tyr Glu Lys
210 215 220

Met Gly Phe Thr Arg Ser Ala Thr Gly Ile Thr Tyr Arg Glu Trp Ala
225 230 235 240

Leu Gly Ala Gln Ser Ala Ala Leu Ile Gly Asp Phe Asn Asn Trp Asp
245 250 255

Ala Asn Ala Asp Ile Met Thr Arg Asn Glu Phe Gly Val Trp Glu Ile
260 265 270

Phe Leu Pro Asn Asn Val Asp Gly Ser Pro Ala Ile Pro His Gly Ser
275 280 285

Arg Val Lys Ile Arg Met Asp Thr Pro Ser Gly Val Lys Asp Ser Ile
290 295 300

Pro Ala Trp Ile Asn Tyr Ser Leu Gln Leu Pro Asp Glu Ile Pro Tyr
305 310 315 320

Asn Gly Ile His Tyr Asp Pro Pro Glu Glu Glu Arg Tyr Ile Phe Gln
325 330 335

His Pro Arg Pro Lys Lys Pro Lys Ser Leu Arg Ile Tyr Glu Ser His
340 345 350

Ile Gly Met Ser Ser Pro Glu Pro Lys Ile Asn Ser Tyr Val Asn Phe
355 360 365

Arg Asp Glu Val Leu Pro Arg Ile Lys Lys Leu Gly Tyr Asn Ala Leu
370 375 380

Gln Ile Met Ala Ile Gln Glu His Ser Tyr Tyr Ala Ser Phe Gly Tyr
385 390 395 400

His Val Thr Asn Phe Phe Ala Pro Ser Ser Arg Phe Gly Thr Pro Asp
405 410 415

Asp Leu Lys Ser Leu Ile Asp Lys Ala His Glu Leu Gly Ile Val Val
420 425 430

Leu Met Asp Ile Val His Ser His Ala Ser Asn Asn Thr Leu Asp Gly
435 440 445

Leu Asn Met Phe Asp Cys Thr Asp Ser Cys Tyr Phe His Ser Gly Ala
450 455 460
Arg Gly Tyr
465 

Trp Glu Val 

Ala Phe Lys 

Tyr Ile His
515 

Tyr Phe Gly 
530 

Val Asn Asp 
545 

Glu Asp Val 

Gly Val Gly 

Ile Glu Leu
595 

Val His Thr 
610 

Ala Glu Ser 
625 

Trp Leu Met 
Ser Thr Ser 



Leu Val Thr 
675 

Asn Glu Phe 
690 

His Leu Ser 
705 



His Trp 



Leu Arg 
485 

Phe Asp 
500 



Met Trp 
470 

Tyr Leu 
Gly Phe 



Asp Ser Arg Leu 
475 



Phe Asn 



Leu Ser Asn Ala Arg Trp 
490 

Arg Phe Asp Gly Val Thr 
505 



His Gly Leu Ser 



Leu Ala 

Leu Ile

Ser Gly 
565 

Phe Asp 
580 

Leu Lys 

Leu Thr 

His Asp 

Asp Lys 
645

Leu Ile
660 

Met Gly 
Gly His 
Asp Gly 



Thr Asp 
535 

His Gly 
550 



Val Gly Phe Thr 
520 

Val Asp Ala Val 



Gly Asn 
525 

Val Tyr 
540 



Leu Phe Pro Asp Ala Ile
555 



Met Pro Thr Phe Cys Ile
570 



Pro Val 



Tyr Arg 

Lys Arg 

Asn Arg 
615 

Gln Ala

630 

Asp Met 
Asp Arg 
Leu Gly 



Pro Glu 
695 

Ser Val 
710 



Leu His Met Ala Ile Ala
585 



Asp Glu Asp Trp 
600 

Arg Trp Ser Glu 



Arg Val 
605 

Lys Cys 
620 



Tyr Gly Asn 
480 

Trp Leu Asp 
495 

Ser Met Met 
510 

Tyr Glu Glu 

Leu Met Leu 

Thr Ile Gly
560 

Gln Glu Gly
575 

Asp Lys Arg 
590 

Gly Asp Ile
Val Ser Tyr 



Leu Val Gly Asp Lys Thr 
635 

Tyr Asp Phe Met Ala Leu 
650 

Gly Ile Ala Leu His Lys
665 



Gly Glu Gly Tyr 
680 

Trp Ile Asp Phe



Leu Asn 
685 

Pro Arg 
700 



Ile Pro Gly Asn Gln Phe
715 



Ile Ala Phe
640 

Asp Arg Pro 
655 

Met Ile Arg

670 

Phe Met Gly 



Ala Glu Gln



Ser Tyr Asp 
720 
Lys Cys Arg

Arg Gly Leu 

Tyr Glu Phe 
755 

Gly Asp Arg 
770 

Asn Phe His 
785 

Lys Pro Gly 
Gly Gly Phe 



Gly Trp Tyr 
835 

Lys Thr Ala 
850 

Glu Glu Glu 
865 



Arg Arg 
725 

Gln Glu
740 
Phe Asp Leu Gly Asp Ala Glu Tyr Leu Arg Tyr
730 735 

Phe Asp Arg Pro Met Gln Tyr Leu Glu Asp Lys
745 750 



Met Thr Ser Glu His Gln Phe Ile Ser Arg Lys Asp Glu

760 765 

Met Ile Val Phe Glu Lys Gly Asn Leu Val Phe Val Phe
775 780 

Trp Thr Lys Ser Tyr Ser Asp Tyr Arg Ile Ala Cys Leu
790 795 800

Lys Tyr Lys Val Ala Leu Asp Ser Asp Asp Pro Leu Phe 
805 810 815 

Gly Arg Ile Asp His Asn Ala Glu Tyr Phe Thr Phe Glu
820 825 830

Asp Asp Arg Pro Arg Ser Ile Met Val Tyr Ala Pro Cys
840 845 

Val Val Tyr Ala Leu Val Asp Lys Glu Glu Glu Glu Glu 
855 860 

Glu Glu Glu Val Ala Ala Val Glu Glu Val Val Val Glu 
870 875 880 



Glu Glu 



(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2576 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

TCATTAAAGA GGAGAAATTA ACTATGAGAG GATCTCACCA TCACCATCAC CATGGGATCT
60

TGGCTGAAAA GTCTTCTTAC AATTCCGAAT TCCGACCTTC TACAGTTGCA GCATCGGGGA
120

AAGTCCTTGT GCCTGGAACC CAGAGTGATA GCTCCTCATC CTCAACAAAC CAATTTGAGT
180

TCACTGAGAC ATCTCCAGAA AATTCCCCAG CATCAACTGA TGTAGATAGT TCAACAATGG
240

AACACGCTAG CCAGATTAAA ACTGAGAACG ATGACGTTGA GCCGTCAAGT GATCTTACAG
300

GAAGTGTTGA AGAGCTGGAT TTTGCTTCAT CACTACAACT ACAAGAAGGT GGTAAACTGG
360

AGGAGTCTAA AACATTAAAT ACTTCTGAAG AGACAATTAT TGATGAATCT GATAGGATCA
420

GAGAGAGGGG CATCCCTCCA CCTGGACTTG GTCAGAAGAT TTATGAAATA GACCCCCTTT
480

TGACAAACTA TCGTCAACAC CTTGATTACA GGTATTCACA GTACAAGAAA CTGAGGGAGG
540

CAATTGACAA GTATGAGGGT GGTTTGGAAG CTTTTTCTCG TGGTTATGAA AAAATGGGTT
600

TCACTCGTAG TGCTACAGGT ATCACTTACC GTGAGTGGGC TCCTGGTGCC CAGTCAGCTG
660

CCCTCATTGG AGATTTCAAC AATTGGGACG CAAATGCTGA CATTATGACT CGGAATGAAT
720

TTGGTGTCTG GGAGATTTTT CTGCCAAATA ATGTGGATGG TTCTCCTGCA ATTCCTCATG
780

GGTCCAGAGT GAAGATACGT ATGGACACTC CATCAGGTGT TAAGGATTCC ATTCCTGCTT
840

GGATCAACTA CTCTACAGCT TCCTGATGAA ATTCCATATA ATGGAATATA TTATGATCCA
900

CCCGAAGAGG AGAGGTATAT CTTCCAACAC CCACGGCCAA AGAAACCAAA GTCGCTGAGA
960

ATATATGAAT CTCATATTGG AATGAGTAGT CCGGAGCCTA AAATTAACTC ATACGTGAAT
1020

TTTAGAGATG AAGTTCTTCC TCGCATAAAA AAGCTTGGGT ACAATGCGCT GCAAATTATG
1080

GCTATTCAAG AGCATTCTTA TTATGCTAGT TTTGGTTATC ATGTCACAAA TTTTTTTGCA
1140

CCAAGCAGCC GTTTTGGAAC GCCCGACGAC CTTAAGTCTT TGATTGATAA AGCTCATGAG
1200

CTAGGAATTG TTGTTCTCAT GGACATTGTT CACAGCCATG CATCAAATAA TACTTTAGAT
1260

GGACTGAACA TGTTTGACGG CACCGATAGT TGTTACTTTC ACTCTGGAGC TCGTGGTTAT
1320

CATTGGATGT GGGATTCCCG CCTTTTTAAC TATGGAAACT GGGAGGTACT TAGGTATCTT
1380

CTCTCAAATG CGAGATGGTG GTTGGATGAG TTCAAATTTG ATGGATTTAG ATTTGATGGT
1440

GTGACATCAA TGATGTATAC TCACCACGGA TTATCGGTGG GATTCACTGG GAACTACGAG
1500

GAATACTTTG GACTCGCAAC TGATGTGGAT GCTGTTGTGT ATCTGATGCT GGTCAACGAT
1560

CTTATTCATG GGCTTTTCCC AGATGCAATT ACCATTGGTG AAGATGTTAG CGGAATGCCG
1620

ACATTTTGTA TTCCCGTTCA AGATGGGGGT GTTGGCTTTG ACTATCGGCT GCATATGGCA
1680

ATTGCTGATA AATGGATTGA GTTGCTCAAG AAACGGGATG AGGATTGGAG AGTGGGTGAT
1740

ATTGTTCATA CACTGACAAA TAGAAGATGG TCGGAAAAGT GTGTTTCATA CGCTGAAAGT
1800

CATGATCAAG CTCTAGTCGG TGATAAAACT ATAGCATTCT GGCTGATGGA CAAGGATATG
1860

TATGATTTTA TGGCTCTGGA TAGACCGCCA ACATCATTAA TAGATCGTGG GATAGCATTG
1920

CACAAGATGA TTAGGCTTGT AACTATGGGA TTAGGAGGAG AAGGGTACCT AAATTTCATG
1980

GGAAATGAAT TCGGCCACCC TGAGTGGATT GATTTCCCTA GGGCTGAACA ACACCTCTCT 
2040 

GATGACTCAG TAATTCCCGG AAACCAATTC AGTTATGATA AATGCAGACG GAGATTTGAC 
2100 

CTGGGAGATG CAGAATATTT AAGATACCGT GGGTTGCAAG AATTTGACCG GGCTATGCAG 
2160 

TATCTTGAAG ATAAATATGA GTTTATGACT TCAGAACACC AGTTCATATC ACGAAAGGAT 
2220 

GAAGGAGATA GGATGATTGT ATTTGAAAAA GGAAACCTAG TTTTTGTCTT TAATTTTCAC 
2280 

TGGACAAAAA GCTATTCAGA CTATCGCATA GGCTGCCTGA AGCCTGGAAA ATACAAGGTT 
2340 

GCCTTGGACT CAGATGATCC ACTTTTTGGT GGCTTCGGGA GAATTGATCA TAATGCCGAA 
2400 

TATTTCACCT TTGAAGGATG GTATGATGAT CGTCCTCGTT CAATTATGGT GTATGCACCT 
2460 

TGTAGAACAG CAGTGGTCTA TGCACTAGTA GACAAAGAAG AAGAAGAAGA AGAAGAAGAA 
2520 

GAAGAAGTAG CAGTAGTAGA AGAAGTAGTA GTAGAAGAAG AATGAACGAA CTTGTG 
2576 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2529 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

GGATGCTAAT GTTTCTGTAT TCTTGAAAAA GCACTCTCTT TCACGGAAGA TCTTGGCTGA 
60 
AAAGTCTTCT TACAATTCCG AATCCCGACC TTCTACAGTT GCAGCATCGG GGAAAGTCCT
120

TGTGCCTGGA AYCCAGAGTG ATAGCTCCTC ATCCTCAACA GACCAATTTG AGTTCACTGA
180

GACATCTCCA GAAAATTCCC CAGCATCAAC TGATGTAGAT AGTTCAACAA TGGAACACGC
240

TAGCCAGATT AAAACTGAGA ACGATGACGT TGAGCCGTCA AGTGATCTTA CAGGAAGTGT
300

TGAAGAGCTG GATTTTGCTT CATCACTACA ACTACAAGAA GGTGGTAAAC TGGAGGAGTC
360

TAAAACATTA AATACTTCTG AAGAGACAAT TATTGATGAA TCTGATAGGA TCAGAGAGAG
420

GGGCATCCCT CCACCTGGAC TTGGTCAGAA GATTTATGAA ATAGACCCCC TTTTGACAAA
480

CTATCGTCAA CACCTTGATT ACAGGTATTC ACAGTACAAG AAACTGAGGG AGGCAATTGA
540

CAAGTATGAG GGTGGTTTGG AAGCTTTTTC TCGTGGTTAT GAAAAAATGG GTTTCACTCG
600

TAGTGCTACA GGTATCACTT ACCGTGAGTG GGCTCCTGGT GCCCAGTCAG CTGCCCTCAT
660

TGGAGATTTC AACAATTGGG ACGCAAATGC TGACATTATG ACTCGGAATG AATTTGGTGT
720

CTGGGAGATT TTTCTGCCAA ATAATGTGGA TGGTTCTCCT GCAATTCCTC ATGGGTCCAG
780

AGTGAAGATA CGYATGGACA CTCCATCAGG TGTTAAGGAT TCCATTCCTG CTTGGATCAA
840

CTACTCTTTA CAGCTTCCTG ATGAAATTCC ATATAATGGA ATATATTATG ATCCACCCGA
900

AGAGGAGAGG TATRTCTTCC AACACCCACG GCCAAAGAAA CCAAAGTCGC TGAGAATATA
960

TGAATCTCAT ATTGGAATGA GTAGTCCGGA GCCTAAAATT AACTCATACG TGAATTTTAG
1020



AGATGAAGTT CTTCCTCGCA TAAAAAASCT TGGGTACAAT GCGGTGCAAA TTATGGCTAT
1080

TCAAGAGCAT TCTTATTATG CTAGTTTTGG TTATCATGTC ACAAATTTTT TTGCACCAAG
1140

CAGCCGTTTT GGAACGCCCG ACGACCTTAA GTCTTTGATT GATAAAGCTC ATGAGCTAGG
1200

AATTGTTGTT CTCATGGACA TTGTTCACAG CCATGCATCA AATAATACTT TAGATGGACT
1260

GAACATGTTT GACGGCACAG ATAGTTGTTA CTTTCACTCT GGAGCTCGTG GTTATCATTG
1320

GATGTGGGAT TCCCGCCTCT TTAACTATGG AAACTGGGAG GTACTTAGGT ATCTTCTCTC
1380

AAATGCGAGA TGGTGGTTGG ATGAGTTCAA ATTTGATGGA TTTAGATTTG ATGGTGTGAC
1440

ATCAATGATG TATACTCACC ACGGATTATC GGTGGGATTC ACTGGGAACT ACGAGGAATA
1500

CTTTGGACTC GCAACTGATG TGGATGCTGT TGTGTATCTG ATGCTGGTCA ACGATCTTAT
1560

TCACGGGCTT TTCCCAGATG CAATTACCAT TGGTGAAGAT GTTAGCGGAA TGCCGACATT
1620

TTGTATTCCC GTTCAAGATG GGGGTGTTGG CTTTGACTAT CGGCTGCATA TGGCAATTGC
1680

TGATAAATGG ATTGAGTTGC TCAAGAAACG GGATGAGGAT TGGAGAGTGG GTGATATTGT
1740

TCATACACTG ACAAATAGAA GATGGTCGGA AAAGTGTGTT TCATMCGCTG AAAGTCATGA
1800

TCAAGCTCTA GTCGGTGATA AAACTATAGC ATYCTGGCTG ATGGACAAGG ATATGTATGA
1860

TTTTATGGCT CTGGATAGAC CGYCAACAYC ATTAATAGAT CGTGGGATAG CATTGCACAA
1920

GATGATTAGG CTTGTAACTA TGGGATTAGG AGGAGAAGGG TACCTAAATT TCATGGGAAA
1980

TGAATTCGGC CACCCTGAGT GGATTGATTT CCCTAGGGCT GARCAACACC TCTCTGATGG
2040 

CTCAGTAATT CCCGGAAACC AATTCAGTTA TGATAAATGC AGACGGAGAT TTGACCTGGG 
2100 

AGATGCAGAA TATTTAAGAT ACCATGGGTT GCAAGAATTT GACCGGGCTA TGCAGTATCT 
2160 

TGAAGATAAA TATGAGTTTA TGACTTCAGA ACACCAGTTC ATATCACGAA AGGATGAAGG 
2220 

AGATAGGATG ATTGTATTTG AAARAGGAAA CCTAGTTTTT GTCTTTAATT TTCACTGGAC 
2280 

AAATAGCTAT TCAGACTATC GCATAGGCTG CCTGAAGCCT GGAAAATACA AGGTTGGCTT 
2340 

GGACTCAGAT GATCCACTTT TTGGTGGCTT CGGGAGAATT GATCATAATG CCGAATATTT 
2400 

CACCTCTGAA GGATCGTATG ATGATCGTCC TCGTTCAATT ATGGTGTATG CACCTAGTAG 
2460 

AACAGCAGTG GTCTATGCAC TAGTAGACAA ANTAGAAGNA GAAGAAGAAG AAGAANCCGN 
2520 

NGAAGAATT 
2529 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3231 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

GATTTAATAC GACTCACTAT AGGGATTTTT TTTTTTTTTT TTTTAAAAAC CTCCTCCACT 
60 
CAGTCTTGGG ATCTCTCTCT CTCTTCACGC TTCTCTTGGG GCCTTGAACT CAGCAATTTG
120

ACACTCAGTT AGTTACACTC CTATCACTCA TCAGATCTCT ATTTTTTCTC TTAATTCCAA
180

CCAAGGAATG AATTAAAAGA TTAGATTTGA AGGAGAGAAG AAGAAAGATG GTGTATACAC
240

TCTCTGGAGT TCGTTTTCCT ACTGTTCCAT CAGTGTACAA ATCTAATGGA TTCAGCAGTA
300

ATGGTGATCG GAGGAATGCT AATGTTTCTG TATTCTTGAA AAAGCACTCT CTTTCACGGA
360

AGATCTTGGC TGAAAAGTCT TCTTACGATT CCGAATCCCG ACCTTCTACA GTTGCAGCAT
420

CGGGGAAAGT CCTTGTACCT GGAATCCAGA GTGATAGCTC CTCATCCTCA ACAGACCAAT
480

TTGAGTTCAC TGAGACAGCT CCAGAAAATT CCCCAGCATC AACTGATGTG GATAGTTCAA
540

CAATGGAACA CGCTAGCCAG ATTAAAACTG AGAACGATGA CGTTGAGCCG TCAAGTGATC
600

TTACAGGAAG TGTTGAAGAG TTGGATTTTG CTTCATCACT ACAACTACAA GAAGGTGGTA
660

AACTGGAGGA GTCTAAAACA TTAAATACTT CTGAAGAGAC AATTATTGAT GAATCTGATA
720

GGATCAGAGA GAGGGGCATC CCTCCACCTG GACTTGGTCA GAAGATTTAT GAAATAGACC
780

CCCTTTTGAC AAACTATCGT CAACACCTTG ATTACAGGTA TTCACAGTAC AAGAAAATGA
840

GGGAGGCAAT TGACAAGTAT GAGGGTGGTT TGGAAGCTTT TTCTCGTGGT TATGAAAAAA
900

TGGGTTTCAC TCGTAGTGCT ACAGGTATCA CTTACCGTGA GTGGGCTCCT GGTGCCCAGT
960

CAGCTGCTCT CATTGGAGAT TTCAACAATT GGGACGCAAA TGCTGACATT ATGACTCGGA
1020



ATGAATTTGG TGTCTGGGAG ATTTTTCTGC CAAATAATGT GGATGGTTCT CCTGCAATTC
1080

CTCATGGGTC CAGAGTGAAG ATACGCATGG ACACTTCATC AGGTGTTAAG GATTCCATTC
1140

CTGCTTGGAT CAACTACTCT TTACAGCTTC CTGATGAAAT TCCATATAAT GGAATATATT
1200

ATGATCCACC CGAAGAGGAG AGGTATGTCT TCCAACACCC ACGGCCAAAG AAACCAAAGT
1260

CGCTGAGAAT ATATGAATCT CATATTGGAA TGAGTAGTCC GGAGCCTAAA ATTAACTCAT
1320

ACGTGAATTT TAGAGATGAA GTTCTTCCTC GCATAAAAAA CCTTGGGTAC AATGCGGTGC
1380

AAATTATGGC TATTCAAGAG CATTCTTATT ATGCTAGTTT TGGTTATCAT GTCACAAATT
1440

TTTTTGCACC AAGCAGCCGT TTTGGAACGC CCGACGACCT TAAGTCTTTG ATTGATAAAG
1500

CTCATGAGCT AGGAATTGTT GTTCTCATGG ACATTGTTCA CAGCCATGCA TCAAATAATA
1560

CTTTAGATGG ACTGAACATG TTTGACGGCA CAGATAGTTG TTACTTTCAC TCTGGAGCTC
1620

GTGGTTATCA TTGGATGTGG GATTCCCGCC TCTTTAACTA TGGAAACTGG GAGGTACTTA
1680

GGTATCTTCT CTCAAATGCG AGATGGTGGT TGGATGAGTG CAAATTTGRT GGATTTAGAT
1740

TTGATGGTGT GACATCAATG ATGTATACTC ACCACGGATT ATCGGTGGGA TTCACTGGGA
1800

ACTACGAGGA ATACTTTGGA CTCGCAACTG ATGTRGATGC TGCCGTGTAT CTGATGCTGG
1860

CCAACGATCT TATTCATGGG CTTTTCCCAG ATGCAATTAC CATTGGTGAA GATGTTAGCG
1920

GAATGCCGAC ATTTTGTATT CCCGTTCAAG ATGGGGGTGT TGGCTTTGAC TATCGGCTGC
1980



ATATGGCAAT TGCTGATAAA TGGATTGAGT TGCTCAAGAA ACGGGATGAG GATTGGAGAG
2040

TGGGTGATAT TGTTCATACA CTGACAAATA GAAGATGGTC GGAAAAGTGT GTTTCATACG
2100

CTGAAAGTCA TGATCAAGCT CTAGTCGGTG ATAAAACTAT AGCATTCTGG CTGATGGACA
2160

AGGATATGTA TGATTTTATG GCTTTGGATA GACCGTCAAC ATCATTAATA GATCGTGGGA
2220

TAGCATTGCA CAAGATGATT AGGCTTGTAA CTATGGGATT AGGAGGAGAA GGGTACCTAA
2280

ATTTCATGGG AAATGAATTC GGCCACCCTG AGTGGATTGA TTTCCCTAGG GCTGAACAAC
2340

ACCTCTCTGA TGGCTCAGTA ATTCCCGGAA ACCAATTCAG TTATGATAAA TGCAGACGGA
2400

GATTTGACCT GGGAGATGCA GAATATTTAA GATACCGTGG GTTGCAAGAA TTTGACCGGG
2460

CTATGCAGTA TCTTGAAGAT AAATATGAGT TTATGACTTC AGAACACCAG TTCATATCAC
2520

GAAAGGATGA AGGAGATAGG ATGATTGTAT TTGAAAAAGG AAACCTAGTT TTTGTCTTTA
2580

ATTTTCACTG GACAAAAAGC TATTCAGACT ATCGCATAGG CTGGCTGAAG CCTGGAAAAT
2640

ACAAGGTTGC CTTGGACTCA GATGATCCAC TTTTTGGTGG CTTCGGGAGA ATTGATCATA
2700

ATGCCGAATG TTTCACCTTT GAAGGATGGT ATGATGATCG TCCTCGTTCA ATTATGGTGT
2760

ATGCACCTAG TAGAACAGCA GTGGTCTATG CACTAGTAGA CAAAGAAGAA GAAGAAGAAG
2820

AAGTAGCAGT AGTAGAAGAA GTAGTAGTAG AAGAAGAATG AACGAACTTG TGATCGCGTT
2880

GAAAGATTTG AACGCTACAT AGAGCTTCTT GACGTATCTG GCAATATTGC ATCAGTCTTG
2940




GCGGAATTTC ATGTGACAAA AGGTTTGCAA TTCTTTCCAC TATTAGTAGT GCAACGATAT 
3000 

ACGCAGAGAT GAAGTGCTGA ACAAACATAT GTAAAATCGA TGAATTTATG TCGAATGCTG 
3060 

GGACGGGCTT CAGCAGGTTT TGCTTAGTGA GTTCTGTAAA TTGTCATCTC TTTANATGTA 
3120 

CAGCCCACTA GAAATCAATT ATGTGAGACC TAAAAAACAA TAACCATAAA ATGGAAATAG 
3180 

TGCTGATCTA ATGATGTTTT AANCCNNNNA AAAAAAAAAA AAAAACTCGA G 
3231 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2578 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

TCATTAAAGA GGAGAAATTA ACTATGAGAG GATCTCACCA TCACCATCAC CATGGGATCT 
60 

TGGCTGAAAA GTCTTCTTAC AATTCCGAAT TCCGACCTTC TACAGTTGCA GCATCGGGGA 
120 

AAGTCCTTGT GCCTGGAACC CAGAGTGATA GCTCCTCATC CTCAACAAAC CAATTTGAGT 
180 

TCACTGAGAC ATCTCCAGAA AATTCCCCAG CATCAACTGA TGTAGATAGT TCAACAATGG 
240 

AACACGCTAG CCAGATTAAA ACTGAGAACG ATGACGTTGA GCCGTCAAGT GATCTTACAG 
300 

GAAGTGTTGA AGAGCTGGAT TTTGCTTCAT CACTACAACT ACAAGAAGGT GGTAAACTGG 
360 

AGGAGTCTAA AACATTAAAT ACTTCTGAAG AGACAATTAT TGATGAATCT GATAGGATCA 
420 

GAGAGAGGGG CATCCCTCCA CCTGGACTTG GTCAGAAGAT TTATGAAATA GACCCCCTTT 
480 

TGACAAACTA TCGTCAACAC CTTGATTACA GGTATTCACA GTACAAGAAA CTGAGGGAGG 
540 

CAATTGACAA GTATGAGGGT GGTTTGGAAG CTTTTTCTCG TGGTTATGAA AAAATGGGTT 
600 

TCACTCGTAG TGCTACAGGT ATCACTTACC GTGAGTGGGC TCCTGGTGCC CAGTCAGCTG 
660 

CCCTCATTGG AGATTTCAAC AATTGGGACG CAAATGCTGA CATTATGACT CGGAATGAAT 
720 

TTGGTGTCTG GGAGATTTTT CTGCCAAATA ATGTGGATGG TTCTCCTGCA ATTCCTCATG 
780 

GGTCCAGAGT GAAGATACGT ATGGACACTC CATCAGGTGT TAAGGATTCC ATTCCTGCTT 
840 

GGATCAACTA CTCTTCACAG CTTCCTGATG AAATTCCATA TAATGGAATA TATTATGATC 
900 

CACCCGAAGA GGAGAGGTAT ATCTTCCAAC ACCCACGGCC AAAGAAACCA AAGTCGCTGA 
960 

GAATATATGA ATCTCATATT GGAATGAGTA GTCCGGAGCC TAAAATTAAC TCATACGTGA 
1020 

ATTTTAGAGA TGAAGTTCTT CCTCGCATAA AAAAGCTTGG GTACAATGCG GTGCAAATTA 
1080 

TGGCTATTCA AGAGCATTCT TATTATGCTA GTTTTGGTTA TCATGTCACA AATTTTTTTG 
1140 

CACCAAGCAG CCGTTTTGGA ACGCCCGACG ACCTTAAGTC TTTGATTGAT AAAGCTCATG 
1200 

AGCTAGGAAT TGTTGTTCTC ATGGACATTG TTCACAGCCA TGCATCAAAT AATACTTTAG 
1260 

ATGGACTGAA CATGTTTGAC GGCACCGATA GTTGTTACTT TCACTCTGGA GCTCGTGGTT 
1320 

ATCATTGGAT GTGGGATTCC CGCCTTTTTA ACTATGGAAA CTGGGAGGTA CTTAGGTATC 
1380

TTCTCTCAAA TGCGAGATGG TGGTTGGATG AGTTCAAATT TGATGGATTT AGATTTGATG
1440

GTGTGACATC AATGATGTAT ACTCACCACG GATTATCGGT GGGATTCACT GGGAACTACG
1500

AGGAATACTT TGGACTCGCA ACTGATGTGG ATGCTGTTGT GTATCTGATG CTGGTCAACG
1560

ATCTTATTCA TGGGCTTTTC CCAGATGCAA TTACCATTGG TGAAGATGTT AGCGGAATGC
1620

CGACATTTTG TATTCCCGTT CAAGATGGGG GTGTTGGCTT TGACTATCGG CTGCATATGG
1680

CAATTGCTGA TAAATGGATT GAGTTGCTCA AGAAACGGGA TGAGGATTGG AGAGTGGGTG
1740

ATATTGTTCA TACACTGACA AATAGAAGAT GGTCGGAAAA GTGTGTTTCA TACGCTGAAA
1800

GTCATGATCA AGCTCTAGTC GGTGATAAAA CTATAGCATT CTGGCTGATG GACAAGGATA
1860

TGTATGATTT TATGGCTCTG GATAGACCGC CAACATCATT AATAGATCGT GGGATAGCAT
1920

TGCACAAGAT GATTAGGCTT GTAACTATGG GATTAGGAGG AGAAGGGTAC CTAAATTTCA
1980

TGGGAAATGA ATTCGGCCAC CCTGAGTGGA TTGATTTCCC TAGGGCTGAA CAACACCTCT
2040

CTGATGACTC AGTAATTCCC GGAAACCAAT TCAGTTATGA TAAATGCAGA CGGAGATTTG
2100

ACCTGGGAGA TGCAGAATAT TTAAGATACC GTGGGTTGCA AGAATTTGAC CGGGCTATGC
2160

AGTATCTTGA AGATAAATAT GAGTTTATGA CTTCAGAACA CCAGTTCATA TCACGAAAGG
2220

ATGAAGGAGA TAGGATGATT GTATTTGAAA AAGGAAACCT AGTTTTTGTC TTTAATTTTC
2280

ACTGGACAAA AAGCTATTCA GACTATCGCA TAGGCTGCCT GAAGCCTGGA AAATACAAGG
2340

TTGCCTTGGA CTCAGATGAT CCACTTTTTG GTGGCTTCGG GAGAATTGAT CATAATGCCG 
2400 

AATATTTCAC CTTTGAAGGA TGGTATGATG ATCGTCCTCG TTCAATTATG GTGTATGCAC 
2460 

CTTGTAGAAC AGCAGTGGTC TATGCACTAG TAGACAAAGA AGAAGAAGAA GAAGAAGAAG 
2520 

AAGAAGAAGT AGCAGTAGTA GAAGAAGTAG TAGTAGAAGA AGAATGAACG AACTTGTG 
2578 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

AATTTYATGG GNAAYGARTT YGG 
23 